AI621 A5 : Focal stack & Lightfield

20215584 Wonjoon Chang

Initialization

Load the light field image, and create from it a 5-dimensional array L(u, v, s, t, c).
Each lenslet covers a block of 16x16 pixels.
filename = "./data/chessboard_lightfield.png";
lf = imread(filename);
The original light field image:
U = 16; V = 16;
S = size(lf,1)/U;
T = size(lf,2)/V;
C = size(lf,3);
img5d = zeros(U,V,S,T,C, 'uint8');
for u=1:U
for v=1:V
img5d(u,v,:,:,:) = lf((U-u+1):U:end,v:V:end,:);
end
end
size(img5d)
ans = 1×5
16 16 400 700 3

Sub-aperture views

By rearranging the pixels in the light field image, we can create images that correspond to views of the scene through a pinhole placed at different poins on the camera aperture (with fixed u and v).
Create all of these sub-aperture views, and arrange them into a 2D mosaic.
mosaic = zeros(U*S,V*T,C, 'uint8');
 
for u=1:U
s1 = (U-u)*S+1;
s2 = (U-u+1)*S;
for v=1:V
t1 = (v-1)*T+1;
t2 = v*T;
mosaic(s1:s2,t1:t2,:) = img5d(u,v,:,:,:);
end
end
imshow(mosaic(1:1200,1:2100,:))
imwrite(mosaic,"./result/mosaic.png");
Mosaic of sub-aperture views:

Refocusing and focal-stack generation

Refocus at different depths by combining parts of the light field.
d denotes the depth. For d=0, we obtain the image where the top of the chess board is focused.
img5d = im2double(img5d);
u_centered = (1:U) - 1 - (U-1)/2;
v_centered = (1:V) - 1 - (V-1)/2;
d = -1;
refocused = zeros(S,T,C);
num = zeros(S,T,C);
for u=1:U
ud = u_centered(u)*d;
for v=1:V
vd = v_centered(v)*d;
 
subap = squeeze(img5d(u,v,:,:,:));
% shift sub-aperture
I = imtranslate(subap,[-vd, -ud]);
% count nonzero area
num(I>0) = num(I>0) + 1;
refocused = refocused + I;
end
end
 
refocused = refocused./num;
imshow(refocused)

Refocusing results for different depths

We can easily identify that different areas are focused by shifting sub-aperture. In the next step, we will combine refocused images with d=[0, -0.2, -0.4, -0.6, -0.8, -1.0, -1.2, -1.4, -1.6].
N = 9
N = 9
focal_stack = zeros(N,S,T,C);
 
for i=1:N
d = -(i-1)/5;
refocused = zeros(S,T,C);
num = zeros(S,T,C);
for u=1:U
ud = u_centered(u)*d;
for v=1:V
vd = v_centered(v)*d;
subap = squeeze(img5d(u,v,:,:,:));
I = imtranslate(subap,[-vd, -ud]);
num(I>0) = num(I>0) + 1;
refocused = refocused + I;
end
end
focal_stack(i,:,:,:) = refocused./num;
end
for i=1:N
filename = "./result/refocus-"+i+".png";
imwrite(squeeze(focal_stack(i,:,:,:)), filename);
end

All-focus image and Depth from defocus

To merge the focal stack into a single all-focus image, we first determine per-pixel and per-image weights. Here is the detail procedure:
W = zeros(N,S,T);
 
for i=1:N
img = squeeze(focal_stack(i,:,:,:));
sig1 = 5;
sig2 = 5;
 
img = rgb2xyz(img);
lumi = img(:,:,2);
low = imgaussfilt(lumi, sig1);
high = lumi - low;
sharp = imgaussfilt(high.^2, sig2);
 
W(i,:,:) = sharp;
end
Once we obtain the sharpness weights, we can compute the all-focus image as:
allfocus = zeros(S,T,C);
W_sum = zeros(S,T);
 
for i=1:N
img = squeeze(focal_stack(i,:,:,:));
for c=1:3
allfocus(:,:,c) = allfocus(:,:,c) + squeeze(W(i,:,:)).*img(:,:,c);
end
W_sum = W_sum + squeeze(W(i,:,:));
end
for c=1:3
allfocus(:,:,c) = allfocus(:,:,c)./W_sum;
end
imshow(allfocus)
In addition, we can create a per-pixel depth map:
depth = zeros(S,T);
for i=1:N
d = -(i-1)/5;
depth = depth + squeeze(W(i,:,:))*d;
end
depth = depth./W_sum;
minD = min(depth,[],"all");
maxD = max(depth,[],"all");
depth = (depth - minD)/(maxD - minD);
imshow(depth);

Combining results

I think that the first case (sigma1=5, sigma2=5) is the best. With larger sigma1, the high frequency component may not be emphasized enough and then the result becomes somewhat blurry. With larger sigma2, the depth map becomes too blurry to recognize the objects. With very small sigma1, sigma2, the image and the depth map can have noisy components.

My own images

focals = zeros(4,640,640,3);
for i = 1:4
filename = "./data/focal"+i+".jpeg";
focals(i,:,:,:) = imread(filename);
end
W = zeros(4,640,640);
 
for i=1:4
img = squeeze(focals(i,:,:,:));
sig1 = 2;
sig2 = 10;
 
img = rgb2xyz(img);
lumi = img(:,:,2);
low = imgaussfilt(lumi, sig1);
high = lumi - low;
sharp = imgaussfilt(high.^2, sig2);
 
W(i,:,:) = sharp;
end
 
allfocus = zeros(640,640,3);
W_sum = zeros(640,640);
 
for i=1:4
img = squeeze(focals(i,:,:,:));
for c=1:3
allfocus(:,:,c) = allfocus(:,:,c) + squeeze(W(i,:,:)).*img(:,:,c);
end
W_sum = W_sum + squeeze(W(i,:,:));
end
 
for c=1:3
allfocus(:,:,c) = allfocus(:,:,c)./W_sum;
end
Original images:
sigma1=2, sigma2=10.
In the combined image, we can see that all objects are clearly visible.
maxI = max(allfocus, [], "all");
imshow(allfocus/maxI)